Cooperation for Arabic Language Resources and Tools - The MEDAR Project

Authors

  • Bente Maegaard
  • Mohammed Attia
  • Khalid Choukri
  • Olivier Hamon
  • Steven Krauwer
  • Mustafa Yaseen
Abstract

This paper describes some of the work carried out within the European-funded project MEDAR. The project has three streams of activity: the technical stream, the cooperation stream and the dissemination stream. MEDAR first updated the existing surveys and the BLARK for Arabic; the technical stream then focused on machine translation (MT). The consortium identified a number of freely available MT systems and then customized two versions of the well-known MOSES package. The consortium addressed the need to package MOSES for English-to-Arabic (while the main MT stream is on Arabic-to-English). To assess performance, the partners produced test data that made it possible to run an evaluation campaign with five different systems (including some from outside the consortium) and two online ones. Both the MT baselines and the collected data will be made available via the ELRA catalogue. The cooperation stream focuses mostly on a Cooperation Roadmap for the region, directed towards Human Language Technologies (HLT) for Arabic in general. The purpose of the roadmap is to outline areas and priorities for collaboration, both between EU countries and Arabic-speaking countries and more generally: between countries, between universities, and, last but not least, between universities and industry.

1. Background and Mission

The goals of the MEDAR project are the production and availability of shareable language resources (LRs) and tools; the advancement of Arabic language technology, in particular multilingual resources and tools; and collaboration between institutions from the Euro-Mediterranean countries towards these goals. We are now almost through the project and can see how it has worked. We believe that the approach can be generalised and applied successfully in other regions. MEDAR is structured in three overlapping 'streams': 1) the technical stream, 2) the Cooperation Roadmap stream, and 3) the dissemination stream. This paper focuses mainly on the Cooperation Roadmap.

Similar resources

MEDAR: Collaboration between European and Mediterranean Arabic Partners to Support the Development of Language Technology for Arabic

After the successful completion of the NEMLAR project 2003-2005, a new opportunity for a project was opened by the European Commission, and a group of largely the same partners is now executing the MEDAR project. MEDAR will be updating the surveys and BLARK for Arabic already made, and will then focus on machine translation (and other tools for translation) and information retrieval with a focu...

NEMLAR - An Arabic Language Resources Project

The NEMLAR project is a European Commission supported project with partners from the EU and from Arabic-speaking countries in the Mediterranean region. The project aims at surveying the state of the art of language resources and tools for Arabic in the region, at developing a BLARK definition for Arabic, and at starting development of language resources or updating of existing language resources....

Evaluation Methodology and Results for English-to-Arabic MT

This paper describes the evaluation campaign of the MEDAR project for English-to-Arabic (EnAr) MT systems. The campaign aimed at establishing some basic facts about the state of the art for MT on EnAr, collecting enough data to better train and tune systems and assessing the improvements made. The paper details the data used and their formats, the evaluation methodology and the results obtained...

Al - Khalil : The Arabic Linguistic Ontology Project

In this paper we present our project to build an ontology-centered infrastructure for Arabic resources and applications. The core of this infrastructure is a linguistic ontology founded on Arabic Traditional Grammar. The methodology we have chosen consists in reusing an existing ontology, namely the Gold linguistic ontology. We discuss the development of the ontology and present our ...

The resource-constraint project scheduling problem of the project subcontractors in a cooperative environment: Highway construction case study

Large-scale projects often have several activities which are performed by subcontractors with limited multi-resources. Project scheduling with limited resources is one of the best-known problems in operations research and optimization. The resource-constrained project scheduling problem (RCPSP) is an NP-hard problem in which the activities of a project must be scheduled to reduce the p...

Publication date: 2010